ci: fix Check job status to detect skipped results (fixes #2208)#2210
Conversation
…esult Fixes NVIDIA#2208. The previous `Check job status` body only treated a dep's `result == 'cancelled' || == 'failure'` as failure, letting `'skipped'` slip through silently. When a `build-*` job fails, the dependent `test-*` job is set to `'skipped'` by default needs-failure propagation, and the aggregator passes -- exactly the case demonstrated by NVIDIA#2209. Adopt CCCL's `check_result` pattern: explicit `expected="success"` per dependency, with `expected="skipped"` for legitimate `[doc-only]` skips, and an early short-circuit for `[no-ci]`. Now any deviation from the expected status (including `'skipped'` from a failed upstream) fails the aggregator. Reference: NVIDIA/cccl ci-workflow-pull-request.yml L463-L526.
|
/ok to test 170c799 |
This comment has been minimized.
This comment has been minimized.
|
/ok to test 739c499 |
Mirrors the NVIDIA#2209 reproducer on this branch's new aggregator. Expected outcome: - All Build * matrix entries fail at the env-vars build step. - Downstream Test * jobs are skipped (cascaded needs-failure). - Check job status now reports failure -- the symmetric, post-fix counterpart to NVIDIA#2209 (where Check job status reported success despite the same set of failures). DO NOT MERGE while this commit is present. Remove before merge. Refs NVIDIA#2208, NVIDIA#2209.
|
/ok to test 811388c |
Gating-check fix validated end-to-endThe new aggregator catches the exact failure mode that motivated #2208. Demonstrated by running the #2209 reproducer (intentional
Same reproducer, opposite verdict. The skipped-as-success hole is closed. The demo commit |
This reverts commit 811388c.
|
/ok to test 7d31a04 |
rwgk
left a comment
There was a problem hiding this comment.
Looks good to me and codex gpt-5.5:
The new check_result approach is a clear improvement over the previous cancelled || failure checks: it closes the skipped-as-success hole while keeping the intentional [doc-only] and [no-ci] cases explicit.
The one subtle behavior change I noticed is that fork or non-NVIDIA repo runs may now report a red final aggregator where jobs are skipped by owner-gated conditions. I agree that is not a concern for this PR, and it is preferable to avoid weakening the aggregator in a way that could reintroduce false-green CI.
|
Fixes #2208.
Replaces the
cancelled || failurepredicate in.github/workflows/ci.yml'schecks:job with CCCL'scheck_resultpattern: per-depexpected="success", withexpected="skipped"for legitimate[doc-only]skips, and an early short-circuit for[no-ci].Reference: https://github.com/NVIDIA/cccl/blob/8c0e6cb1b6412d7bd0070f9ac3f55fa80231961a/.github/workflows/ci-workflow-pull-request.yml#L463-L526
End-to-end validation (pre-fix vs post-fix reproducer): #2210 (comment)